Name | Version | Summary | Date |
runlocal-hub | 0.1.4 | Python client for benchmarking and validating ML models on real devices via RunLocal API | 2025-07-28 19:06:22 |
mqt.bench | 2.0.1 | MQT Bench - A MQT tool for Benchmarking Quantum Software Tools | 2025-07-28 12:34:27 |
novaeval | 0.4.0 | A comprehensive, open-source LLM evaluation framework for testing and benchmarking AI models | 2025-07-22 19:20:41 |
py-perf-jg | 0.2.0 | A lightweight Python performance tracking library with automatic data collection and visualization | 2025-07-22 01:03:01 |
django-concurrent-test | 1.0.0 | Production-ready Django package for safe and configurable concurrent testing with isolated databases, timing analytics, and concurrency simulation middleware | 2025-07-14 00:25:24 |
AgentDS-Bench | 1.2.2 | Python client for AgentDS-Bench: A streamlined benchmarking platform for evaluating AI agent capabilities in data science tasks | 2025-07-09 21:21:17 |
benchwise | 0.1.0a1 | The GitHub of LLM Evaluation - Python SDK | 2025-07-08 10:16:01 |
guidellm | 0.2.1 | Guidance platform for deploying and managing large language models. | 2025-04-29 17:49:39 |
actbench | 0.0.1a5 | A framework for evaluating web automation agents and LAM systems. | 2025-02-27 23:49:24 |
examinationrag | 0.1.4 | XRAG: eXamining the Core - Benchmarking Foundational Component Modules in Advanced Retrieval-Augmented Generation | 2025-02-07 04:58:08 |
airflow-parse-bench | 1.0.1 | Easily measure and compare your Airflow DAGs' parse time. | 2025-01-26 03:39:23 |
nxbench | 0.1.24 | A centralized benchmarking suite to facilitate comparative profiling of tools across graph analytic libraries and datasets | 2024-12-28 08:14:36 |
jax-hpc-profiler | 0.2.9 | HPC Plotter and profiler for benchmarking data made for JAX | 2024-11-29 19:38:37 |
cmdbench | 0.1.22 | Quick and easy benchmarking for any command's CPU, memory, disk usage and runtime. | 2024-11-20 06:53:26 |
multi-comp-matrix | 0.0.2 | Multi Comparison Matrix: A long term approach to benchmark evaluations | 2024-11-04 17:02:29 |
flow-judge | 0.1.2 | A small yet powerful LM Judge | 2024-10-29 07:32:52 |
miRBench | 1.0.0 | A collection of datasets and predictors for benchmarking miRNA target site prediction algorithms | 2024-10-15 11:37:58 |
posebench | 0.5.0 | Comprehensive benchmarking of protein-ligand structure generation methods | 2024-09-30 16:22:19 |
mlos-viz | 0.6.1 | Visualization Python interface for benchmark automation and optimization results. | 2024-08-16 18:15:54 |
mlos-bench | 0.6.1 | MLOS Bench Python interface for benchmark automation and optimization. | 2024-08-16 18:15:52 |